Skip to content

Binary parameter handling for GeoDjango #2169

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

eduzen
Copy link

@eduzen eduzen commented Jul 27, 2025

Description

hey folks, this is my very first PR in the django-commons. I did this because I faced the issue myself in one project that I'm working on. I don't know how I can test this in that project. So I will wait for some of you to tell me how I can do that... but in the meantime this is the fix for the SQL Explain functionality for GeoDjango queries with binary parameters.

Fixes #423

SQL Explain feature throws a 500 error when used with GeoDjango queries containing binary parameters. This affects any query using PostGIS functions like ST_GeomFromEWKB(), ST_Distance_Sphere(), etc.
I'm trying to create a robust binary parameter handling system:

  1. Enhanced Parameter Encoding (tracking.py)
  • Detects binary data (bytes, bytearray) during parameter logging
  • Encodes binary data as base64 with a special marker: {"djdt_binary": "base64_data"}
  • Preserves exact binary integrity through JSON serialization
  1. Parameter Reconstruction (forms.py)
  • Added _reconstruct_params() function to decode stored parameters
  • Recursively handles nested structures (lists, dicts) containing binary data
  • Restores original binary data from base64 encoding
  1. Updated SQL Operations
  • Modified select(), explain(), and profile() methods to use parameter reconstruction
  • Added helper method _get_query_params() to eliminate code duplication

Checklist:

  • I have added the relevant tests for this change.
  • I have added an item to the Pending section of docs/changes.rst.

pre-commit-ci bot and others added 4 commits August 3, 2025 16:46
@eduzen
Copy link
Author

eduzen commented Aug 3, 2025

@tim-schilling I added some commits following what you were pointing out, let me know what do you think!

import json


class DebugToolbarJSONDecoder(json.JSONDecoder):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than the approach you've taken here, you may want to consider going with something like:

import base64
import json

def object_hook(obj):
    for key, value in obj.items():
        if isinstance(value, str) and value.startswith("__djdt_binary__"):
            _, encoded = value.split("__djdt_binary__", maxsplit=1)
            obj[key] = base64.b64decode(encoded)
    return obj


class DebugToolbarJSONDecoder(json.JSONDecoder):
    """Custom JSON decoder that reconstructs binary data during parsing."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if self.object_hook is None:
            self.object_hook = object_hook

This will require you to change the tests. The decoding will only work if there's an actual object being passed in. Most of the tests are passing in a list of values. Using an object/dictionary is more representative of what will actually be used since this is for parameters for a SQL query which is a dictionary.

# Handle binary data (e.g., GeoDjango EWKB geometry data)
if isinstance(param, (bytes, bytearray)):
# Mark as binary data for later reconstruction
return {"__djdt_binary__": base64.b64encode(param).decode("ascii")}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move this logic into a shared utility that way we don't have the magic value "__djdt_binary__" floating around the code-base. Plus it would be clearer how the logic gets paired together.

I'm also curious if we should be using this with the Store logic. If we do, then we could potentially ignore this and handle it within DebugToolbarJSONEncoder and then have DebugToolbarJSONDecoder as above. What do you think?

Comment on lines +61 to +63
self.assertEqual(len(reconstructed), 1)
self.assertEqual(reconstructed[0], binary_data)
self.assertIsInstance(reconstructed[0], bytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a benefit to being this verbose. Try to be more concise such as:

Suggested change
self.assertEqual(len(reconstructed), 1)
self.assertEqual(reconstructed[0], binary_data)
self.assertIsInstance(reconstructed[0], bytes)
self.assertEqual(reconstructed, [binary_data])

Copy link
Member

@tim-schilling tim-schilling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for continuing to work on this! I have a few directional changes. If you want more specific feedback, I can provide it. I didn't want to be overly prescriptive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error debugging GeoDjango queries
2 participants